Semantic genetic programming for fast and accurate data knowledge discovery

Authors

  • Mauro Castelli
  • Leonardo Vanneschi
  • Luca Manzoni
  • Ales Popovic
Abstract

Big data knowledge discovery emerged as an important factor contributing to advancements in society at large. Still, researchers continuously seek to advance existing methods and provide novel ones for analysing vast data sets to make sense of the data, extract useful information, and build knowledge to inform decision making. In the last few years, a very promising variant of genetic programming was proposed: geometric semantic genetic programming. Its difference with the standard version of genetic programming consists in the fact that it uses new genetic operators, called geometric semantic operators, that, acting directly on the semantics of the candidate solutions, induce by definition a unimodal error surface on any supervised learning problem, independently from the complexity and size of the underlying data set. This property should improve the evolvability of genetic programming in presence of big data and thus makes geometric semantic genetic programming an extremely promising method for mining vast amounts of data. Nevertheless, to the best of our knowledge, no contribution has appeared so far to employ this new technology to big data problems. This paper intends to fill this gap. For the first time, in fact, we show the effectiveness of geometric semantic genetic programming on several complex real-life problems, characterized by vast amounts of data, coming from several different application domains. © 2015 Elsevier B.V. All rights reserved.
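The abstract does not spell out the geometric semantic operators, so the following is only a minimal sketch of how they are commonly described in the wider literature, not the authors' implementation. It assumes that a candidate's "semantics" is the vector of its outputs on the training cases, that crossover is a convex combination of two parents' semantics, and that mutation is a small bounded perturbation. The names `gs_crossover`, `gs_mutation`, `mse`, and the parameter `step` are hypothetical, and a random constant stands in for the random function usually used in practice.

```python
import random


def gs_crossover(sem_a, sem_b, r):
    """Convex combination of two semantics vectors.

    Assumption: r is a constant in [0, 1]; the child then lies on the
    segment between the two parents in semantic space.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    return [r * a + (1.0 - r) * b for a, b in zip(sem_a, sem_b)]


def gs_mutation(sem, rng, step=0.1):
    """Perturb each output by step * (r1 - r2), with r1, r2 drawn from [0, 1).

    Assumption: every coordinate moves by less than `step`.
    """
    return [s + step * (rng.random() - rng.random()) for s in sem]


def mse(sem, target):
    """Mean squared error between a semantics vector and the target outputs."""
    return sum((s - t) ** 2 for s, t in zip(sem, target)) / len(target)


if __name__ == "__main__":
    rng = random.Random(0)
    target = [1.0, 2.0, 3.0]      # hypothetical training targets
    parent_a = [0.0, 0.0, 0.0]    # hypothetical parent semantics
    parent_b = [2.0, 2.0, 2.0]
    child = gs_crossover(parent_a, parent_b, rng.random())
    print(mse(parent_a, target), mse(parent_b, target), mse(child, target))
```

Because mean squared error is a convex function of the semantics vector, a child produced by this form of crossover cannot have a larger error than the worse of its two parents. That is one way to read the abstract's claim of a unimodal error surface.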

Similar resources

Cluster Based Cross Layer Intelligent Service Discovery for Mobile Ad-Hoc Networks

The ability to discover services in Mobile Ad hoc Network (MANET) is a major prerequisite. Cluster based cross layer intelligent service discovery for MANET (CBISD) is cluster based architecture, caching of semantic details of services and intelligent forwarding using network layer mechanisms. The cluster based architecture using semantic knowledge provides scalability and accuracy. Also, the mini...

A New Correlation Based on Multi-Gene Genetic Programming for Predicting the Sweet Natural Gas Compressibility Factor

Gas compressibility factor (z-factor) is an important parameter widely applied in petroleum and chemical engineering. Experimental measurements, equations of state (EOSs) and empirical correlations are the most common sources in z-factor calculations. However, these methods have serious limitations such as being time-consuming as well as those from a computational point of view, like instabilit...

A Fast and Self-Repairing Genetic Programming Designer for Logic Circuits

Usually, important parameters in the design and implementation of combinational logic circuits are the number of gates, transistors, and the levels used in the design of the circuit. In this regard, various evolutionary paradigms with different competency have recently been introduced. However, while being advantageous, evolutionary paradigms also have some limitations including: a) lack of con...

Scalable Link Discovery for Modern Data-Driven Applications

The constant growth of volume and velocity of knowledge bases on the Linked Data Web has led to an increasing need for scalable linking techniques between resources. Modern data-driven applications often have to integrate large amounts of data relying on fast but accurate Link Discovery solutions. Hence, they often operate under time or space constraints. Additionally, most Link Discovery fram...

Estimation of Discharge over the Submerged Compound Sharp-Crested Weir using Artificial Neural Networks and Genetic Programming

Truncated sharp crested weirs are used to measure flow rate and control upstream water surface in irrigation canals and laboratory flumes. The main advantages of such weirs are ease of construction and capability of measuring a wide range of flows with sufficient accuracy. Artificial neural networks (ANNs) and genetic programming (GP) have recently been used for estimation of hydraulic data. In...

Journal:
  • Swarm and Evolutionary Computation

Volume 26  Issue 

Pages  -

Publication date 2016